{
 "cells": [
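  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Hands-On Machine Learning from Scratch (零基础实战机器学习)\n",
    "\n",
    "## Lecture 18: Growth Models in Practice, Fission Campaigns\n",
    "\n",
    "Author: 黄佳 (Huang Jia)\n",
    "\n",
    "Geek Time (极客时间) column: https://time.geekbang.org/column/intro/438\n",
    "\n",
    "\n",
    "Problem: determine which fission campaign produces the larger growth uplift.\n",
    "\n",
    "The marketing team designed two fission (viral referral) campaigns for users of 易速鲜花: “情侣花享” and “拼团盛放”. The first is a promotion similar to buy-one-get-one-free, which users share with their partner. The second has users generate a personalized poster to start a group purchase, and the larger the group, the deeper the discount.\n",
    "\n",
    "So how can a machine automatically identify the users most responsive to a given fission campaign (or any other promotion), so that the campaign is sent to exactly those users?\n"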
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data import"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "用户数: 64000\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>用户码</th>\n",
       "      <th>曾助力</th>\n",
       "      <th>曾拼团</th>\n",
       "      <th>曾推荐</th>\n",
       "      <th>设备</th>\n",
       "      <th>城市类型</th>\n",
       "      <th>R值</th>\n",
       "      <th>M值</th>\n",
       "      <th>裂变类型</th>\n",
       "      <th>是否转化</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>电脑</td>\n",
       "      <td>中小</td>\n",
       "      <td>10</td>\n",
       "      <td>138.00</td>\n",
       "      <td>情侣花享</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>电脑</td>\n",
       "      <td>一线</td>\n",
       "      <td>4</td>\n",
       "      <td>105.59</td>\n",
       "      <td>拼团盛放</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>电脑</td>\n",
       "      <td>一线</td>\n",
       "      <td>1</td>\n",
       "      <td>494.13</td>\n",
       "      <td>拼团盛放</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>手机</td>\n",
       "      <td>二线</td>\n",
       "      <td>10</td>\n",
       "      <td>148.45</td>\n",
       "      <td>没有促销</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>手机</td>\n",
       "      <td>二线</td>\n",
       "      <td>2</td>\n",
       "      <td>337.18</td>\n",
       "      <td>拼团盛放</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   用户码  曾助力  曾拼团  曾推荐  设备 城市类型  R值      M值  裂变类型  是否转化\n",
       "0    1    0    1    0  电脑   中小  10  138.00  情侣花享     0\n",
       "1    2    0    1    0  电脑   一线   4  105.59  拼团盛放     0\n",
       "2    3    0    1    0  电脑   一线   1  494.13  拼团盛放     1\n",
       "3    4    0    1    1  手机   二线  10  148.45  没有促销     0\n",
       "4    5    1    0    0  手机   二线   2  337.18  拼团盛放     0"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd  # import pandas\n",
    "import numpy as np  # import NumPy\n",
    "df_fission = pd.read_csv('易速鲜花增长模型.csv')  # load the dataset\n",
    "print('用户数:', df_fission.count()['用户码'])  # number of users (non-null 用户码 entries)\n",
    "df_fission.head()  # show the first few rows"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Visualization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYoAAAEWCAYAAAB42tAoAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAdeklEQVR4nO3de7iVdZ338fcHAcFIBNwmUsgoZQcVy43pI41AQo2CZXZguhwNr2eYZzrYjOZMTmNWk89kdT1GOlFMNRVDR80uxUw5aWEJYQYppWlDSqltwsADkofv88f3t2uz2ftmK6x7sVmf13Wta6/1u9da+7e42fdn/Q73/VNEYGZm1psBza6AmZnt2RwUZmZWyUFhZmaVHBRmZlbJQWFmZpUcFGZmVmlgsyuwux144IExbty4ZlfDzKxfue222zZGRFtP2/a6oBg3bhyrV69udjXMzPoVSb/ubZu7nszMrJKDwszMKjkozMyskoPCzMwqOSjMzKySg8LMzCo5KMzMrJKDwszMKjkozMys0l53ZvazdewFX2l2FfZ6t33irIa8730fOaoh72t/NvaDP2vYe594+YkNe29Lt7znlt3yPm5RmJlZJQeFmZlVclCYmVklB4WZmVVyUJiZWSUHhZmZVXJQmJlZJQeFmZlValhQSPqypFslXSNpmKRFktZIWqA05LmWNarOZma2o4YEhaRJwMCIOB7YHzgH2BARE4ARwDTgzF0oMzOzmjSqRfEQMLfL7/gQsLg8XgZMAabuQpmZmdWkIUEREb+MiFWSTgeeAW4HNpfNW4CRwKhdKDMzs5o0coziNOBcYCbwIDC8bBoObCy351rW/XfNkbRa0uqOjo7d/2HMzFpYo8YoDgYuAGZExCPAUmB62TwVWL6LZduJiPkR0R4R7W1tbbv/A5mZtbBGtSjOBkYDN0haAQwCxkhaC2wiD/4Ld6HMzMxq0pD1KCLiUuDSbsWf6/Z4GzDjOZaZmVlNfMKdmZlVclCYmVklB4WZmVVyUJiZWSUHhZmZVXJQmJlZJQeFmZlVclCYmVklB4WZmVVyUJiZWSUHhZmZVXJQmJlZJQeFmZlVclCYmVklB4WZmVVyUJiZWSUHhZmZVWpYUEgaJOnacn+ypBXldr+ksyVNlLShS/kRkoZIWiRpjaQFSjuUNarOZma2o4YEhaShwG3ANICIuCkiJkXEJGAtcDswApjXWR4RdwFnAhsiYkLZPq2XMjMzq0lDgiIitkbE0cCGruWS9gPGR8Ra8qB/hqRVkq4qLYWpwOLy9GXAlF7KzMysJnWPUUwDlpb79wAXRcRxwGjgJGAUsLls3wKM7KVsO5LmSFotaXVHR0cDq29m1nrqDoqZwKJyfz2wpMv9g4CNwPBSNrw87qlsOxExPyLaI6K9ra2tIRU3M2tVtQVF6VqaTHYfAZwHzJI0ADgSuINsbUwv26cCy3spMzOzmtTZopgIrIuIJ8rjK4DZwErg6ohYBywExkhaC2wiQ6KnMjMzq8nARr55RIzvcn8VcFqXxw+QLYyuz98GzOj2Nj2VmZlZTXzCnZmZVXJQmJlZJQeFmZlVclCYmVklB4WZmVVyUJiZWSUHhZmZVXJQmJlZJQeFmZlVclCYmVklB4WZmVVyUJiZWSUHhZmZVXJQmJlZJQeFmZlVclCYmVmlhgWFpEGSri33J0raIGlFuR0haYikRZLWSFqg1KeyRtXZzMx21JCgkDQUuA2YVopGAPMiYlK53QWcCWyIiAll+7RnUWZmZjVpSFBExNaIOBrYUIpGAGdIWiXpqtIqmAosLtuXAVOeRZmZmdWkrjGKe4CLIuI4YDRwEjAK2Fy2bwFGPouy7UiaI2m1pNUdHR0N+xBmZq2orqBYDyzpcv8gYCMwvJQNL4/7WradiJgfEe0R0d7W1taA6puZta66guI8YJakAcCRwB3AUmB62T4VWP4syszMrCZ1BcUVwGxgJXB1RKwDFgJjJK0F
NpGB0NcyMzOrycBGvnlEjC8/HwAmd9u2DZjR7SV9LTMzs5r4hDszM6vkoDAzs0oOCjMzq+SgMDOzSg4KMzOr5KAwM7NKDgozM6vkoDAzs0oOCjMzq+SgMDOzSg4KMzOr5KAwM7NKDgozM6vkoDAzs0oOCjMzq+SgMDOzSg0LCkmDJF3b5fGXJd0q6RpJAyVNlLRB0opyO0LSEEmLJK2RtEBph7JG1dnMzHbUkKCQNBS4DZhWHk8CBkbE8cD+5BrYI4B5ETGp3O4CzgQ2RMSEsn1aL2VmZlaThgRFRGyNiKOBDaXoIWBut985AjhD0ipJV5WWwlRgcdm+DJjSS5mZmdWkljGKiPhlRKySdDrwDHAjcA9wUUQcB4wGTgJGAZvLy7YAI3spMzOzmgys6xdJOg04F5gZEU9JWg/cUTavBw4CNgLDS9nw8nhYD2Xd33sOMAdg7NixjfkAZmYtqpYWhaSDgQuAGRHxSCk+D5glaQBwJBkaS8nxC8gup+W9lG0nIuZHRHtEtLe1tTXug5iZtaC6pseeTXYv3VBmOJ0DXAHMBlYCV0fEOmAhMEbSWmATGRI9lZmZWU0a2vUUEePLz0uBS3t4yuRuz98GzOj2nJ7KzMysJj7hzszMKlW2KCQdDjxNzlTaYTN5bsS9jaiYmZntGXbW9bQCuJ4MhdcBN3T7+T3gnEZW0MzMmmtnQfGLiDgHQNLyiJjd7adDwsxsL7ezMYro4X709EQzM9s77SwofAE+M7MW92xaFM9mm5mZ7SV2FhTHSvqBpMXAEZJu7PLzZaXczMz2YjsbzB4TEY/2ttFrQ5iZ7f0qWxQR8aikgyS19/KU3srNzGwv0ZdLeIwFviLpVuA3wO3kuRVTgX8BTmxc9czMrNl2dmb2aPKs7K8B84AXAW8s9+8HTm5w/czMrMl21qL4GHAU8DC5DsSRwGDgbcC7gOPJs7PNzGwvtbOgeC+5qtyFwCuA9RHxCQBJ9wLXSropIp5obDXNzKxZdhYUfws8Tq5TvRX4T0nXAN8C3gm8wyFhZrZ329l5FAeQS5S+ELiLPFP7eeSSpIOANY2snJmZNd/OguJ64FZydbqDgJnAEcAxwDeB9/X2QkmDJF1b7g+RtEjSGkkLlJ5z2e744GZm1jc7C4pJ5HoUK8k1rX8C3Af8NCI+DrxG0j7dXyRpKHAbMK0UnQlsiIgJwIhSvitlZmZWk8oxioj4GICke4BfA6OABRExrzzlEnq4cGBEbAWOLq+DPOfiqnJ/GTnmcegulN3Y509oZma7pE9LoUbEryLi6Yj4XZeQICJWRsRTfXiLUcDmcn8LMHIXy8zMrCZ9OTN7d9hIDoBTfm4Ehu1C2XYkzQHmAIwdO3b3197MrIX1qUWxGywFppf7U4Hlu1i2nYiYHxHtEdHe1tbWkA9gZtaq6gqKhcAYSWuBTeTBf1fKzMysJg3teoqI8eXnNmBGt827UmZmZjWpq0VhZmb9lIPCzMwqOSjMzKySg8LMzCo5KMzMrJKDwszMKjkozMyskoPCzMwqOSjMzKySg8LMzCo5KMzMrJKDwszMKjkozMyskoPCzMwqOSjMzKySg8LMzCo5KMzMrFJtQSFpsqQV5Xa/pIslbehSdoSkIZIWSVojaYHSDmV11dnMzGoMioi4KSImRcQkYC3wMDCvsywi7gLOBDZExARgBDCtlzIzM6tJ7V1PkvYDxgMPAWdIWiXpqtJSmAosLk9dBkzppczMzGrSjDGKacBS4B7goog4DhgNnASMAjaX520BRvZSth1JcyStlrS6o6OjwdU3M2stzQiKmcAiYD2wpJStBw4CNgLDS9nw8rinsu1ExPyIaI+I9ra2toZV3MysFdUaFKV7aTLZhXQeMEvSAOBI4A6ypTG9PH0qsLyXMjMzq0ndLYqJwLqIeAK4ApgNrASujoh1wEJgjKS1wCYyJHoqMzOzmgys85dFxCrgtHL/AbJ10XX7NmBGt5f1VGZmZjXxCXdmZlbJQWFm
ZpUcFGZmVslBYWZmlRwUZmZWyUFhZmaVHBRmZlbJQWFmZpUcFGZmVslBYWZmlRwUZmZWyUFhZmaVHBRmZlbJQWFmZpUcFGZmVslBYWZmlWoLCkkTJW2QtKLcJkhaJGmNpAVKQ/pSVledzcys3hbFCGBeREyKiEnksqgbImJC2TYNOLOPZWZmVpM6l0IdAZwh6Q3A/cAfgSvLtmXAFOBQ4Ko+lN1YU53NzFpenS2Ke4CLIuI4YDTwJmBz2bYFGAmM6mPZdiTNkbRa0uqOjo7GfQIzsxZUZ1CsB5Z0uf8MMLw8Hg5sLLe+lG0nIuZHRHtEtLe1tTWi7mZmLavOoDgPmCVpAHAkcD4wvWybCiwHlvaxzMzMalJnUFwBzAZWAlcDXwDGSFoLbCIDYWEfy8zMrCa1DWZHxAPA5G7FM7o93tbHMjMzq4lPuDMzs0oOCjMzq+SgMDOzSg4KMzOr5KAwM7NKDgozM6vkoDAzs0oOCjMzq+SgMDOzSg4KMzOr5KAwM7NKDgozM6vkoDAzs0oOCjMzq+SgMDOzSg4KMzOrVGtQSPqypFslXSNpoqQNklaU2xGShkhaJGmNpAVKO5TVWWczs1ZXW1BImgQMjIjjgf2B0cC8iJhUbncBZwIbImICMAKY1kuZmZnVpM4WxUPA3C6/dwRwhqRVkq4qLYWpwOLynGXAlF7KzMysJrUFRUT8MiJWSTodeAb4BXBRRBxHti5OAkYBm8tLtgAjeynbjqQ5klZLWt3R0dHgT2Jm1lrqHqM4DTgXmAncAywpm9YDBwEbgeGlbHh53FPZdiJifkS0R0R7W1tbw+pvZtaK6hyjOBi4AJgREY8A5wGzJA0AjgTuAJYC08tLpgLLeykzM7Oa1NmiOJvsYrpB0grgcWA2sBK4OiLWAQuBMZLWApvIkOipzMzMajKwrl8UEZcCl3YrvqTbc7YBM7o9p6cyMzOriU+4MzOzSg4KMzOr5KAwM7NKDgozM6vkoDAzs0oOCjMzq+SgMDOzSg4KMzOr5KAwM7NKDgozM6vkoDAzs0oOCjMzq+SgMDOzSg4KMzOr5KAwM7NKDgozM6vUL4JC0hBJiyStkbRAkppdJzOzVtEvggI4E9gQEROAEcC0JtfHzKxl9JegmAosLveXAVOaWBczs5bSX4JiFLC53N8CjGxiXczMWsrAZlegjzYCw8v94eXxn0iaA8wpDx+VdFeNdavbgXT7/Hs6ffLsZldhT9K/9t/FHg7son/tO0DnPqv9d2iv7xMRu16bBpN0DvDqiPg7SdcBl0XEkmbXqxkkrY6I9mbXw54b77/+q5X3XX/peloIjJG0FtgELG1yfczMWka/6HqKiG3AjGbXw8ysFfWXFoX92fxmV8B2ifdf/9Wy+65fjFGYmVnzuEVhZmaVHBQtxJc+MbPnwkHRAiTtL+nVwIRm18WeHUkvaHYdbNf19y9pDoq9lKRBkt4i6V+BI4EXAi+TNKjJVbM+kvRm4APl/khJR/f3A06rkTRa0ijguPK4X+4/B8VeRmk08GrgMPLSJ8eSZ67/PiKelLRPM+toPSv77nRJ+5eiR4H7JX0UuA14XUREfz3YtBpJrwQ+BLwYuFiSop/OHnJQ7AU6DxyS9gUmkd9eRgJPAu3AKcBTwBRJwyLi6WbV1XpXDiJHAidKOgp4FfAa4H+AS4BbuzzP9kCSTiw/JwOTgaHldgnw8rKt3x13+8UJd1aty4FjMLAPcBBwNtmauBe4HXg+cCrwI+AaSQMi4pkmVNcKSX8BPBERD3Qpvh14D7nf1gAvAb4B/CWwv/fbnkvSYGBuuYLEd4AlwH7A6eSFTQ8DTuiP+89B0Y9J+ivgL4BnyAPJRcAPIuImSROB35AXMpsKDALuAR6XdAHwK+CqplS8hZVvk4eQ+w3g7ZJ+DlwOvBWYDfweuBQYC/wSmEUGRkdEXNefuzD2JpLGAm8EboqItWSLYQPZeniiPG0G8GPgZmA/SaeWfdivAr/fNYFamaR9
JD1f0vvLt5e/Af4ZGA/8c0TcS4YGZEicDNwN/CNwP9lX+m5gX+CGuuvfyjq7B8vBYQzwdbLVNxh4rBz4rwPeTB5YbiFbFmvJb6YHAjMljSvjFP7brVnnRBBJL5M0F/gv4BhynwEMAbYBfw98hmzFrySvOPt54Ebyb5b+FBLgoOhXytjCAcCHgZuA1eS3l0sj4n5Jh5aDyGDyoHMycC3wd+Q4xfcj4o0R8dGIeLQZn6FVlf1ysKQ3kF2Ay8mDyGDgt+Vp+5f98kvyQPMPwAPAH8iFu8YD75J0NdnCsJqUCSJnShoPLCC7BadHxDlkS+ElEXErcAHwGLnf/gh8Fzgc+BwwEzhE0knlPfvNpAR3Pe3hJB0PrIuILZI+SHYjfRq4PyI+Lel84GlJnwTukfTFiPijpK3Al8h9/E1yZcCx5T0HkMcud180kKSzgeURcZ+kU8nW3GhycHod2ar7HLBZUhu5r74IbCUPMvsCzyPPfzmcbIEsByZGxC9q/jgtp1sX3+/IlsI1ZDfSDyPiaUmnkH9XBwN3l329Cvgk8AjwKbKlsTUiTpH0f8guxpv709+fWxR7oHIOxDslrQD+BZheNn07IiZHxAXA6yQdQHZJnEp+M70yIv5YnjsS+FpEvCUivkWuO76PpH0i4pn+9J+0P5E0WNK7JV0PfJBs/QH8lJxccCVwFjCOXNL328D1wFvILimAn5fH28jxjCPI/fkz4OmI+FEdn6VVdekm/NPfSGnNryRnFS4gp7teDXwVmBsR3y+tDoCHya7FocDRwBXk5ATI2Yefr+Nz7E5uUewhJD2f7O98CriM/ObyDnKmxOslXR8Rd5Tnnkj2a/9B0mXkf9ZVwKbyPi8CbomIZ7p8K3p3RPyh5o/VMiSNBN4JTAPuAs6LiJ9LulnSIRHxG0kd5AHjxeSSvhvIoBhKhsFbytv9npyd1kYecH4REQ/X+oFaVNdBZknTgQMj4quShpIrwM0gz0m6ihz3WwmcI+kj5Oy0+RHxQ+CHku4l9/HXgRdJ2jci+l1IgFsUe5Jh5EyXB8nBr2vIbodjgO+T0147tZP/8Y4kZzsdAywu/8FnAnO6hQQOicYoLYhPkweO5wFvBz5OnsMCcCc5RgTwNeCNEfET4N/IluKHgdPI/X2xclnfV5F92+si4kcOicbqnBgg6Tjgs5Iml9bCLPLvsdOj5BUOTgNWAK8APgu8EvjriJjf9f3IlseNEbE+Ir5a1tXplxwUTVRmMbVJaitz6R8jz4FYSA5kjiAHrI8CPiZpaOlueinwE+C1ZAvkEuBNkq4k+7HvBZ+Y1UiSDpP08tLVdyzwmYi4MCJ+Q4b3jyQtBK4GpkkaRg5w/kzS/wV+QHY1XUFOVR5DflsdDKyMiKX9bWZMf9Xl3/lo4K/J8x7eT44pbSrbBgAXkl3Bp0TEg+Tf6EDyrPlx5f/E+8gWIsBHypeCfs/rUTRB+cbxQbL7obOLYSk5s2VpRFwk6RhyKt1RZPN2DPCziLisvMcB5LfRL0TEWklnAIMi4us1f5yWImkcOfX4fGBwRHyk/NvPIvfXfLJF0AacFRE3lO7B+yLiMuV1f95EthDbyFAZAnwuIv679g/UoiQdRk4IeVLS6cB7AZEzzs4j9+X/JgeqXxQRj3e20EvQrwYmktPRjwBeQE6H/Sbw73vblzSPUdRI0sHlm8iLydbDm8nm6ziyX1vACWUM4uXl9qbOqaySlpDjF5TxicfI/6RrI8InzzWQpNGl1XcC2QVxJ/CBEtjvI/utTyG/cU4BTgSOJ2c2fQi4TtKnIuL35YzsGcB6ciB0Ub2fpnVJ+l/AueTfzcfJ7sBfkX+LB5D762ry7/EfgXcBbyPPmRAQwLfILwpPk+cxAbwwIlbX9Tnq5q6nmkh6DfCJ8vAEsvvoJeTYxHfIg8u7yCmtbyWns94dEY9KGifpQ8DLy/S6Th8tM5qsgSRNIsePICcNvJ0MhV8Dd5Sui1uAUyNiA7kvP0vur9eS31Bf
Tk6vhJxq+Z6IOM0hUQ9JB5aJHn8L/IKcKfhPpXWwOSI2kpMJgmwVfALooOwr2K6Lal0pv6+89sG9OSTAXU+1knQ5OUXyGfI/5BPk/OtDgEcjYrak4eR5Ep8hDzhvIAfUvktOm5wcER9tQvVbmqQvkie+/Zq8/EYHeZG+zwN3kJcD/w8yFB4ivwx8gJzBdCV5cDk5Ij5Qd91bWQmHd5Ot+A+TX8IWR8RPJb2fbNVdGxGPSfoMeU2mR8jL3Uwi//bGA5+KiO+U6eVPSzqglSaIuOupJsqrgR5M9mNeTv4nPIVsvq4CXizp5IhYIulmsmvi28DfRMQJXd5qRb01b21lPOlt5Fnud0bEeZLayYP/a8hvp/PKONE3yZbH78nzHj7ebdxheb21t4h4BPh3SaeRATAWOEXSQHIG0xTgJ5IeIcPkQ+Q10w4lB6N/IOliclzpO+V8ipabReiup/o8jxy4hhzE/iF56YYfky2Kw8jWA+Sg9rHl/juVvIZEE5TuhuVkn/ZvJV1Ijj3cSHZRnEPOlIEcyD4YWBIRr+0Mif50qYa9UZn2+nryEu5zybVaFpOzBvcDRpTxpzdFxC3kF7RrIuIH5S3uJceeWpaDoj6/Ii/uthyYFRE/Jpu9p5En5ZxMHlMmRsT95KyoNRFxcySvIdE8D5HdD/uQJ2BdQZ5cd0I5wISkk8o8+W+Q3Uw9nuFrTfFjchryMLLb9zFyluHZ5Gy1g+BPrQ/ImU8vkDSmlP93GXtqWe56qklE/E7SJvLbyVxJLyQHrDeRfdsHkLNpNpXn79WDY/1JmRL5FNmCeIOkw8mWw0PlKcvIwL85Ij7c9XW1V9Z2UPbDOmCdpL+MiLcr1yL/PNlF+BlJU4BDI+JL5NjTzT6P5c/coqhRRDwVEY+RJ1u9mTxR50XkuRIPR8T7Iy8VbnueuyPiu+RVQ18L7E9OSCAivhQR5zezctZnsyUdHhEPAfPI8aTnkxMPBgBExFaHxPYcFM2xhDzRSsD/i4g1/va5ZyszXUQuAPU78uqunZds8BhEPyDpEHK22iOlBfH3wH9ExGcj4uSI+GJza7jn8vTYJimX7ehodj3s2VGuhbzG11/qf8rU82XASeQ10bZExHXNrVX/4KAws5Yh6R3kmdUREY83uTr9hoPCzMwqeYzCzMwqOSjMzKySg8JsJyQNeg6vGbyLv3Ngt8c+M9+axkFhVqEsgXmzpNsl3SfpJklzJS2UdIikpeV5+3R5zbuBf+3yeGDXKbSSLpR0gKQ5kmYrV8mbK2lI2T4EWCTpIUmLJP2WXG7zFWX79fV8erPkoDCrEBFbyYv/3U1eowty0ZonyXMptkpqA5ZKWiLpTnJ+/hmS7ixBshR4Vblm12DyhMvp5fVPUlbEAzqXr30iIl5PrnQ3A/h+RPwn0HnV4M56mNXCl/AwqyDpQHLt4/8hr/r7NuB7wFTy5DsiokPSTDIgftftLYYBX4mILWVVtf8CtgKjyUWr7iSXSIVcK2GWpLHkXP/DJP0D8BJJZwOday57qqLVyi0KswplQZu3kxd1/DDwBWDfiJhNnrTV2aX0aWBjl8edNpLdSMMi4lcRcRIZKPeTLYTLgYeBz5YrznaU1zxGLozzU3INk7vJJVPNauegMNu515FXjz2qPD6j/NwMXCJpP/JKwMPINUZOIVdQ+yvy4H4J8GTperqYXJ/5LPIqpQPJ9ZkPl9R5vaiHyu/8RETcBGyNiB8BF0s6nh3DyKyhfMKdWQVJrwQ+SXYXDSAXt7mry1P2BS4ku4M6r/j7WPm5D3B5RPxTea9jyRbERvJy14eRQbKuPH9I2f5bYHznMrdlPYXDIuLrkt4K/Cwifr77P61ZzxwUZn0g6aXk5eDnAZPJA/olwNnlisCdz5tLdhctJJezPT8iNvfwfgeQF4d8EHh/RNzRbfux5Gp5T5DLc+5PjpMMAv4tIrzSodXGQWFWQdJoctGbbcBLgXPJWUffIFcp/GBE/KbL80eQC+VsJQPg
ui7bRHZfTSG7ns4nl1K9kmyNfA+4BXic/Nt8qrxuMnBMRHyqPB4ADOjcbtZoDgqznZA0nlyFcD05a+kjZNfRWeQyqGeRLY1h5EF+CTn76VRyXeahwMyIeFDSZeW9Pt/Z0ijnYJxOrnPxXqCdXOGwtyAYAHwxIq7c3Z/VrCcOCrM+kjS0nFfR07bhPXUxlW1DIuKJxtbOrHEcFGZmVsnTY83MrJKDwszMKjkozMyskoPCzMwqOSjMzKzS/wdXbBM+MrrLWQAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
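   ],
   "source": [
    "import matplotlib.pyplot as plt  # import the pyplot module\n",
    "import seaborn as sns  # import Seaborn\n",
    "fig = sns.countplot(x='裂变类型', data=df_fission)  # count plot per fission type; x is passed by keyword, as seaborn 0.12+ requires\n",
    "fig.set_xticklabels(fig.get_xticklabels(), rotation=25)  # tilt the x-axis labels\n",
    "fig.set_ylabel(\"数目\")  # y-axis title (the label text means count)\n",
    "plt.show()  # display the figure"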
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<BarContainer object of 3 artists>"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAD4CAYAAAAAczaOAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAVdElEQVR4nO3df5BdZ33f8fdHSCDzS8jyOhiDUVsFGyNwaGSPaZVYFsgktcxAnFKHOoBp6qZph4Jjp07ACZMfxU5cMp64OHFLG2LcH2ldk1g0gCybTJSAFRkqxSgx2IkApdSsI7BhYgTG3/5xni3LZnfvlbQrWXrer5k7+5xznvOcc/be/dznPOfcu6kqJEl9WHK0d0CSdOQY+pLUEUNfkjpi6EtSRwx9SerI0qO9A/M56aSTavXq1Ud7NyTpmHLvvfc+XFUTsy17Uof+6tWr2blz59HeDUk6piT53FzLHN6RpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOPKk/kXs4Vl/9oaO9C8etvddeeLR3QdIhsqcvSR0x9CWpI4a+JHXE0Jekjhj6ktSReUM/yfIkW5LsSnJLksxRb1mSO6ZNb0iyvT2+kORNSc5Osm/a/NMX+mAkSfMb1dO/FNhXVWcBK4FNMyskOQG4d/qyqvpYVa2vqvXAbuBTbf2bpuZX1f0LdRCSpPGMCv2NwNZWvgs4f2aFqnqsql4G7Ju5LMnTgTVVtZsh9C9OsiPJbXOdNUiSFs+o0F8FPNLKjwInHmT7m4BtrfwAcE1VnQOcApw32wpJLk+yM8nOycnJg9ycJGk+o0L/YWBFK69o0wfjImBLK+8F7pxWPnm2Farq5qpaV1XrJiZm/b++kqRDNCr0twEXtPJG4O5xG27DNxsYhoUArgAuSbIEWAvcd1B7Kkk6bKNC/1bg1CS7gf3Ag0muH7Pts4E9VfX1Nn0jcBlwD3B7Ve05lB2WJB26eb9wraoOAJtnzL5yjrprZkzvAF4zbfqLDD1/SdJR4oezJKkjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUkXlDP8nyJFuS7EpyS5LMUW9ZkjumTZ+dZF+S7e1x+rhtSZIWz9IRyy8F9lXV5iRbgE3AR6dXSHICcA/wommzVwI3VdUvTav3Y6PaUt9WX/2ho70Lx6291154tHdBTxKjhnc2Altb+S7g/JkVquqxqnoZsG/a7JXAxUl2JLmt9epHtiVJWlyjQn8V8EgrPwqcOGa7DwDXVNU5wCnAeeO2leTyJDuT7JycnBxzc5KkcYwK/YeBFa28ok2PYy9w57TyyeO2VVU3V9W6qlo3MTEx5uYkSeMYFfrbgAtaeSNw95jtXgFckmQJsBa47zDakiQtkFGhfytwapLdwH7gwSTXj9HujcBlDBd4b6+qPbO0te3Qd1uSdCjmvXunqg4Am2fMvnKOumumlb8IbBijLUnSEeSHsySpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1JF5Qz/J8iRbkuxKckuSzFFvWZI7Zsx7f5JPJPndJEuTnJ1kX5Lt7XH6Qh6IJGm0UT39S4F9VXUWsBLYNLNCkhOAe6cvS7IeWFpV5wLPBi5o699UVevb4/4FOgZJ0phGhf5GYGsr3wWcP7NCVT1WVS8D9k2b/RBww4xtrAQuTrIjyW1znTVIkhbPqNBfBTzSyo8CJ47TaFV9tqp2JHkd8ATwUeAB4JqqOgc4BThvtnWTXJ5kZ5Kdk5OT42xOkjSmUaH/MLCilVe06bEkeQ3wVuCiqnoc2Avc2RbvBU6ebb2q
urmq1lXVuomJiXE3J0kaw6jQ38YwHg/DUM/d4zSa5LnAVcDmqvpqm30FcEmSJcBa4L6D311J0uEYFfq3Aqcm2Q3sBx5Mcv0Y7b6JYQjnI+1OnbcANwKXAfcAt1fVnsPYb0nSIVg638KqOgBsnjH7yjnqrplWvg64bpZqGw5y/yRJC8gPZ0lSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JH5v3PWZI0n9VXf+ho78Jxa++1Fy5Ku/b0Jakjhr4kdcTQl6SOzBv6SZYn2ZJkV5JbkmSOesuS3DHfeuO2JUlaPKN6+pcC+6rqLGAlsGlmhSQnAPfOWDbbeiPbkiQtrlGhvxHY2sp3AefPrFBVj1XVy4B9I9Yb2ZYkaXGNCv1VwCOt/Chw4pjtzrbeWG0luTzJziQ7Jycnx9ycJGkco0L/YWBFK69o0+OYbb2x2qqqm6tqXVWtm5iYGHNzkqRxjAr9bcAFrbwRuHvMdmdb71DbkiQtkFGhfytwapLdwH7gwSTXj9HuzPW2zTFPknQEzfs1DFV1ANg8Y/aVc9RdM2K92eZJko4gP5wlSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOzBv6SZYn2ZJkV5JbkmScOkk2JNneHl9I8qYkZyfZN23+6Yt3WJKk2Yzq6V8K7Kuqs4CVwKZx6lTVx6pqfVWtB3YDn2rLbpqaX1X3L9xhSJLGMSr0NwJbW/ku4PyDqZPk6cCaqtrNEPoXJ9mR5LbZzhokSYtrVOivAh5p5UeBEw+yziZgWys/AFxTVecApwDnzbbBJJcn2Zlk5+Tk5OgjkCSNbVToPwysaOUVbfpg6lwEbGnlvcCd08onz7bBqrq5qtZV1bqJiYkRuydJOhijQn8bcEErbwTuHrdOG77ZwDDkA3AFcEmSJcBa4L5D3mtJ0iEZFfq3Aqcm2Q3sBx5Mcv2IOlPDOWcDe6rq6236RuAy4B7g9qrasxAHIEka39L5FlbVAWDzjNlXjlGHqtoBvGba9BcZev6SpKPED2dJUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakj84Z+kuVJtiTZleSWJBmnTpKzk+xLsr09Th+nLUnS4hrV078U2FdVZwErgU1j1lkJ3FRV69vj/jHbkiQtolGhvxHY2sp3AeePWWclcHGSHUlua736cdqSJC2iUaG/CniklR8FThyzzgPANVV1DnAKcN6YbZHk8iQ7k+ycnJwc9zgkSWMYFfoPAytaeUWbHqfOXuDONm8vcPKYbVFVN1fVuqpaNzExMfoIJEljGxX624ALWnkjcPeYda4ALkmyBFgL3DdmW5KkRTQq9G8FTk2yG9gPPJjk+hF1tgE3ApcB9wC3V9WeOepJko6gpfMtrKoDwOYZs68co84XgQ1j1JMkHUF+OEuSOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI/OGfpLlSbYk2ZXkliQZt06S9yf5RJLfTbI0ydlJ9iXZ3h6nL9ZBSZJmN6qnfymwr6rOAlYCm8apk2Q9sLSqzgWeDVzQlt1UVevb4/4FOwpJ0lhGhf5GYGsr3wWcP2adh4AbZmxjJXBxkh1JbpvtrEGStLhGhf4q4JFWfhQ4cZw6VfXZqtqR5HXAE8BHgQeAa6rqHOAU4LzZNpjk8iQ7k+ycnJw8uKORJM1rVOg/
DKxo5RVteqw6SV4DvBW4qKoeB/YCd7Z6e4GTZ9tgVd1cVeuqat3ExMR4RyFJGsuo0N/GMB4PwzDO3ePUSfJc4Cpgc1V9tS27ArgkyRJgLXDf4ey4JOngjQr9W4FTk+wG9gMPJrl+RJ1twJsYhnA+0u7UeQtwI3AZcA9we1XtWcDjkCSNYel8C6vqALB5xuwrx6hzXXvMtOEg90+StID8cJYkdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjoyb+gnWZ5kS5JdSW5JknHqjDtv8Q5LkjSbUT39S4F9VXUWsBLYNGadcedJko6gUaG/EdjayncB549ZZ9x5kqQjaOmI5auAR1r5UeD0MeuMO+9vSHI5cHmb/FqS+0fs4/HiJODho70T48h1R3sPnjR8zo4tx8zzBYf9nL1wrgWjQv9hYEUrr2D2X9hsdZ455ry/oapuBm4esV/HnSQ7q2rd0d4Pjc/n7Nji8zUYNbyzDbiglTcCd49ZZ9x5kqQjaFTo3wqcmmQ3sB94MMn1I+psO4h5kqQjKFV1tPdBDNcy2tCWjhE+Z8cWn6+BoS9JHfETuZLUEUP/CEjyfUlecJDrLE3yjBF1ViZ53uHtnQCSHPbfQpKntJ+rZlnmJ9D1pGDoH4Ykz0/ygVZek+SMJH/Ufp6Z5Jmt6vUMF7G3tcedSe5N8guztPmpVvwe4NoZy1YluSDJhiQbgJ8A3jM1nWRTkhMW6XCPC0lOSfKfW/klSa5oi/5rku9t8z+bZPscj89Pa2tJe3NemmQpcE+SZwG/mOTEGZv+L0le2NY7LclHF/9ojw9Jlh1k/WeNWH5hkr91eHt17Bp1n77m93XgWy3cXwGcADwDWN9+/s8kLwKeVlWfSPI4cGFVPd5CewMMZwLAu1t7q5PcyfC5hucleTGwDHgH8BfAacDjbft/2R6r2/Qy4BOLeLzHg8eBb7byV4FvJFnH8Hx9uQXMI1W1fmqF9hx8oaq+luSPp7X1vtbW/ja9FLgB+GvgHUneCRyoqieA24EXJvlL4AeA21rbSwBaHc3uDUleDbwF+BHgp4CHGH63r55esZ1lbWPoNM3lZ4CLk9wBPB0oYDmwp6oub+18F/BrVfX6JC8F/hz4DeBnge8CPllVBxbuEI8cQ/8QJVkNXMnwPULvAb7IEOLPZ/ieoVTVDUneB/xZW20J8MYkTwBnAN8AqKo/YHijIMkvV9VPJVkDbJzlboP/0M4GHpkx/3lV9aKFPcrjS5K/xxDKL0hyAzAJXMYQyGcAbwN+AXi8DcekhfFVwLVJHgC+Na3JAtYyhPz/Bn4TeAx4NfDuqnosyRuT/BNgAtgD/BDDG/dTk/xom/824PcW7cCPcVX1/iQrgDXAs4GfrqoPJtkyVSfJ/wCew/D3uKp1nKb8VVX9o1bvLIY39f+b5BvA61onbDXD3/OUdwDPT3ItsAN4I0On7JkMHbRXLc7RLj5D/9C9BNjH8CK4gSHwPwC8mOGPe0mSn2PoQUy5iuGFCbAX+Nz0BpOcCaxtY8M/Dvy71nN5RlV9flrVncCnZ+zP3z3sIzrOVdUfJflB4Pqq+lcASc6tqiuTnFFVb23zAM4GrkvybIbf7YsZnufpTgM2M4T/f2QYjtsD/GpV/XHb5m8l+XPgVVX1riRrGXr5L2V4zZxaVQb+HJI8jeHN9B1t1j8G/kErn9bevN9TVT/cztI+Abyyqj4zR5NXA1+aan7m5to2X8HwNQYXt3lfAr7G0Jk7Cbimqh7nGGXoH7rPM7wYzqiqTyf5IYYXyVeAcxjeEH4E+O/AzyY5lWFsf7r9wOsBknw38GHg9VX1rSTPAZ7FcEr5vrY9kvw4wwvylGntPA1YlmR9VW1f+EM9PiQ5DfhB4EVJ/gVwLnBuko8BL02ynRbsVbUjyasYPlT4EPATVbU3
ydtaW09h6HV+Gfgdhi8T/NcMw20vSPJaYLKq/pDh+Zkaovgs8CmGN5K/Ddx3BA79mFVVB5L8CkNP+wMMQzevam/M1wJ/WlVTnad3AScC75123fzlwJqq+nKSv88wFDr1fV5/Bny4Dd/8CcPfH1X18Tb0sx34XwyvgX/D0Nl6PbCuPY5Jhv4hqqo/SXLStFlfZ/j20M8wjO1/b1VVkr9uy5/C8NXSb55aIcnvt5/fA/xbhjD4Slu8DPj3wNuq6uPTtvvrwK+39U4A3snwwn4fw4tSc3sqw5DaZ4D3MvweP1hVF7WhgtcwDMH9ZDvdvxa4CfjRWdraCHykPcf/lKHXX1X18vaGch3DGDTAcxmGBD4FXAR8iCE8ngP89iIc53Glqj4JfDLJKxl66gAHGP6mvg8gyRsYvqhxK/Bb01Z/N9++BrYH+GfA1MX7j1fVzyTZUlWbp1ZoQ3u/A/xDhuG+1cD/4dt/X2cs5PEdad69s3CKIQguYejxzzTbhbqpT8Z9miFwvgB8d5L/xvA9RW+eHvhJzkry+0k+nOTDwJ3AP2d4Ht8M/F6SyxboeI47VfUAcEcrF8MZ00PTlj8x7bT9RIY31AmGs6q/mtHWVuAvkuxheLN/H3BFknsYLhS+uvXyAV7JMOb/LOCrLcTOBJ5bVfvRSEmuqqptwPczjOkvB36j3SBxJvBjDIE+mwKoqi8zhPeUt7efp2X4B0/bWkdqGcNZ299pP09gONPe3h7H5AXcKfb0D09mlN9WVR9rt4zd1OY/pS1bAlww4wLT8wCq6pvAN9sp6UMMp6k/yXD2QLvgt6uqdgHn/f8NDmcIl1TV1WhcZwJPtOGZG4Cfa/OnP5fPYjgTmG5re35OnjbvMYaL+I8zjPmuBbYwXI8ZGh1u3TyTYdjnXwJ7kpzStvHUJGvam5Hm0IZlpm5S2A5cneRLtDvVqmpPOwt4JcNddNOfo9PmaPPFDF/3DvD56T395sokZwC/DLyA4drN2rbsmRzDDP3D8zSGXsFUecoVwAPT5n+r1fvojOGdP5jR3nKG29D+tF38uy3Jlxl6mt/xT2eSLAd+DXj/whxKN17N8Du7FPjD1uuG7/xDfryqzp1t5SS72s/XAj/N0Isshuf4GQz/Ee55wEuSvJXhrqxrGYLoGe3xQeCtDHdg3ZLkna0Xq9ldBfxiKz+F4SaIG4ALpyq0YbYAt1bVO6fmJ7mLb59Rw7ff3N8O/HaS9/KdN1tMrbeU4WaNX2V4Y//NqnpXW3ZMX3j3u3cWSHuRPHEk77dO8rRj9V7hJ7Mkz6mqryxS28uAb029TpKk/COcU5LnAv+JIeCvYRj2/ADDRdefZ7gYf0lV3ZvkQuAVU6Gf5Lo2/f3T2ns+8CsMF4B/vt3G+3aGf+p0AvDadmPGBMM1nzcwXDP7gXb31TuAl1fVDx+J418Mhr6kY0KGDzp+bnpHJ8nTgcdme+NcjDfUJE+tqm8sZJtHmqEvSR3x7h1J6oihL0kdMfQlqSOGviR15P8B0j1lIqpSi/UAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "df_plot = df_fission.groupby('裂变类型')['是否转化'].mean().reset_index()  # mean conversion rate per fission type\n",
    "plt.bar(df_plot['裂变类型'], df_plot['是否转化'])  # bar chart of the mean conversion rate per fission type"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create dummy variables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>用户码</th>\n",
       "      <th>曾助力</th>\n",
       "      <th>曾拼团</th>\n",
       "      <th>曾推荐</th>\n",
       "      <th>R值</th>\n",
       "      <th>M值</th>\n",
       "      <th>是否转化</th>\n",
       "      <th>设备_手机</th>\n",
       "      <th>设备_电脑</th>\n",
       "      <th>城市类型_中小</th>\n",
       "      <th>城市类型_二线</th>\n",
       "      <th>裂变类型</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>10</td>\n",
       "      <td>138.00</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>情侣花享</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>4</td>\n",
       "      <td>105.59</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>拼团盛放</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>494.13</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>拼团盛放</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>10</td>\n",
       "      <td>148.45</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>没有促销</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>337.18</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>拼团盛放</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   用户码  曾助力  曾拼团  曾推荐  R值      M值  是否转化  设备_手机  设备_电脑  城市类型_中小  城市类型_二线  裂变类型\n",
       "0    1    0    1    0  10  138.00     0      0      1        1        0  情侣花享\n",
       "1    2    0    1    0   4  105.59     0      0      1        0        0  拼团盛放\n",
       "2    3    0    1    0   1  494.13     1      0      1        0        0  拼团盛放\n",
       "3    4    0    1    1  10  148.45     0      1      0        0        1  没有促销\n",
       "4    5    1    0    0   2  337.18     0      1      0        0        1  拼团盛放"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
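   ],
   "source": [
    "df_dummies = df_fission.drop(['裂变类型'], axis=1)  # set 裂变类型 aside before creating dummy variables\n",
    "df_dummies = pd.get_dummies(df_dummies, drop_first=True)  # one-hot encode the categorical columns, dropping the first level of each\n",
    "df_dummies['裂变类型'] = df_fission['裂变类型']  # put 裂变类型 back\n",
    "df_fission = df_dummies.copy()  # replace the original dataset with the encoded copy\n",
    "df_fission.head()  # show the data"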
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Build the feature and label datasets"
   ]
  },
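  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "# keep only users who received 情侣花享 or no promotion; .copy() avoids SettingWithCopyWarning on later assignments\n",
    "df_discount = df_fission.query(\"裂变类型 == '情侣花享' | 裂变类型 == '没有促销'\").copy()"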
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>用户码</th>\n",
       "      <th>曾助力</th>\n",
       "      <th>曾拼团</th>\n",
       "      <th>曾推荐</th>\n",
       "      <th>R值</th>\n",
       "      <th>M值</th>\n",
       "      <th>是否转化</th>\n",
       "      <th>设备_手机</th>\n",
       "      <th>设备_电脑</th>\n",
       "      <th>城市类型_中小</th>\n",
       "      <th>城市类型_二线</th>\n",
       "      <th>裂变类型</th>\n",
       "      <th>标签</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>10</td>\n",
       "      <td>138.00</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>情侣花享</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>10</td>\n",
       "      <td>148.45</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>没有促销</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>6</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>10</td>\n",
       "      <td>56.48</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>情侣花享</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>7</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>551.98</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>情侣花享</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>9</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>7</td>\n",
       "      <td>29.99</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>情侣花享</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   用户码  曾助力  曾拼团  曾推荐  R值      M值  是否转化  设备_手机  设备_电脑  城市类型_中小  城市类型_二线  裂变类型  \\\n",
       "0    1    0    1    0  10  138.00     0      0      1        1        0  情侣花享   \n",
       "3    4    0    1    1  10  148.45     0      1      0        0        1  没有促销   \n",
       "5    6    0    1    1  10   56.48     0      0      1        0        1  情侣花享   \n",
       "6    7    1    1    1   2  551.98     0      0      0        0        0  情侣花享   \n",
       "8    9    1    0    1   7   29.99     1      0      1        0        1  情侣花享   \n",
       "\n",
       "    标签  \n",
       "0  1.0  \n",
       "3  3.0  \n",
       "5  1.0  \n",
       "6  1.0  \n",
       "8  0.0  "
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_discount.loc[(df_discount.裂变类型 == '情侣花享') & (df_discount.是否转化 == 1), '标签'] = 0  # TR: treatment responders (received the referral campaign and purchased)\n",
    "df_discount.loc[(df_discount.裂变类型 == '情侣花享') & (df_discount.是否转化 == 0), '标签'] = 1  # TN: treatment non-responders (received the campaign, did not purchase)\n",
    "df_discount.loc[(df_discount.裂变类型 == '没有促销') & (df_discount.是否转化 == 1), '标签'] = 2  # CR: control responders (no campaign, purchased)\n",
    "df_discount.loc[(df_discount.裂变类型 == '没有促销') & (df_discount.是否转化 == 0), '标签'] = 3  # CN: control non-responders (no campaign, did not purchase)\n",
    "df_discount.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = df_discount.drop(['标签','是否转化'], axis=1)  # feature set: drop the label-related columns\n",
    "y = df_discount.标签  # label set (classes 0-3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Split the dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.model_selection import train_test_split\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=16)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>用户码</th>\n",
       "      <th>曾助力</th>\n",
       "      <th>曾拼团</th>\n",
       "      <th>曾推荐</th>\n",
       "      <th>R值</th>\n",
       "      <th>M值</th>\n",
       "      <th>设备_手机</th>\n",
       "      <th>设备_电脑</th>\n",
       "      <th>城市类型_中小</th>\n",
       "      <th>城市类型_二线</th>\n",
       "      <th>裂变类型</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>10</td>\n",
       "      <td>138.00</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>情侣花享</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>10</td>\n",
       "      <td>148.45</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>没有促销</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>6</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>10</td>\n",
       "      <td>56.48</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>情侣花享</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>7</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>551.98</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>情侣花享</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>9</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>7</td>\n",
       "      <td>29.99</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>情侣花享</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>63991</th>\n",
       "      <td>63992</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>509.72</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>情侣花享</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>63992</th>\n",
       "      <td>63993</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>29.99</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>情侣花享</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>63993</th>\n",
       "      <td>63994</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>499.62</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>情侣花享</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>63996</th>\n",
       "      <td>63997</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>158.03</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>情侣花享</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>63998</th>\n",
       "      <td>63999</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>196.02</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>情侣花享</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>42613 rows × 11 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "         用户码  曾助力  曾拼团  曾推荐  R值      M值  设备_手机  设备_电脑  城市类型_中小  城市类型_二线  裂变类型\n",
       "0          1    0    1    0  10  138.00      0      1        1        0  情侣花享\n",
       "3          4    0    1    1  10  148.45      1      0        0        1  没有促销\n",
       "5          6    0    1    1  10   56.48      0      1        0        1  情侣花享\n",
       "6          7    1    1    1   2  551.98      0      0        0        0  情侣花享\n",
       "8          9    1    0    1   7   29.99      0      1        0        1  情侣花享\n",
       "...      ...  ...  ...  ...  ..     ...    ...    ...      ...      ...   ...\n",
       "63991  63992    1    0    1   2  509.72      1      0        0        1  情侣花享\n",
       "63992  63993    1    0    0   1   29.99      1      0        1        0  情侣花享\n",
       "63993  63994    0    1    1   2  499.62      1      0        1        0  情侣花享\n",
       "63996  63997    0    1    1   3  158.03      1      0        0        1  情侣花享\n",
       "63998  63999    1    0    0   1  196.02      0      1        0        1  情侣花享\n",
       "\n",
       "[42613 rows x 11 columns]"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Re-split with the same seed as the earlier split so the results stay reproducible\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=16)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\jacky.huang\\Anaconda3\\lib\\site-packages\\xgboost\\sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
      "  warnings.warn(label_encoder_deprecation_msg, UserWarning)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[11:01:18] WARNING: C:/Users/Administrator/workspace/xgboost-win64_release_1.4.0/src/learner.cc:1095: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'multi:softprob' was changed from 'merror' to 'mlogloss'. Explicitly set eval_metric if you'd like to restore the old behavior.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,\n",
       "              colsample_bynode=1, colsample_bytree=1, gamma=0, gpu_id=-1,\n",
       "              importance_type='gain', interaction_constraints='',\n",
       "              learning_rate=0.300000012, max_delta_step=0, max_depth=6,\n",
       "              min_child_weight=1, missing=nan, monotone_constraints='()',\n",
       "              n_estimators=100, n_jobs=8, num_parallel_tree=1,\n",
       "              objective='multi:softprob', random_state=0, reg_alpha=0,\n",
       "              reg_lambda=1, scale_pos_weight=None, subsample=1,\n",
       "              tree_method='exact', validate_parameters=1, verbosity=None)"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
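    "import xgboost as xgb  # import XGBoost\n",
    "xgb_model = xgb.XGBClassifier()  # create an XGBoost classifier with default hyperparameters\n",
    "xgb_model.fit(X_train.drop(['用户码','裂变类型'], axis=1), y_train)  # fit on the features only (user ID and campaign type are dropped)"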
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0.10408288, 0.42887658, 0.02090275, 0.44613782],\n",
       "       [0.13568689, 0.47382146, 0.0910695 , 0.29942212],\n",
       "       [0.08241459, 0.39959702, 0.10955084, 0.40843758],\n",
       "       ...,\n",
       "       [0.24106652, 0.37841547, 0.18507037, 0.19544768],\n",
       "       [0.04672149, 0.46512628, 0.02754113, 0.46061108],\n",
       "       [0.02860421, 0.450275  , 0.00923175, 0.51188904]], dtype=float32)"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "uplift_probs = xgb_model.predict_proba(X_test.drop(['用户码','裂变类型'], axis=1))  # predict class probabilities for the test-set users\n",
    "uplift_probs  # one column per class, in label order 0-3: TR, TN, CR, CN"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Compute the uplift score"
   ]
  },
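  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough guide to the score computed in this section (an interpretation, not something stated in the original notebook): the classifier outputs four probabilities per user, in label order 0 to 3, for treatment responder (TR), treatment non-responder (TN), control responder (CR), and control non-responder (CN). The score adds the two outcomes that are consistent with the campaign helping and subtracts the two that are not:\n",
    "\n",
    "$$\\text{uplift score} = P_{TR} + P_{CN} - (P_{TN} + P_{CR})$$\n",
    "\n",
    "For example, a user with $P_{TR}=0.104083$, $P_{TN}=0.428877$, $P_{CR}=0.020903$, $P_{CN}=0.446138$ gets\n",
    "\n",
    "$$0.104083 + 0.446138 - (0.428877 + 0.020903) = 0.100441$$\n",
    "\n",
    "A higher score suggests the user is more likely to convert because of the campaign; a negative score suggests the campaign is unlikely to help for that user."
   ]
  },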
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>用户码</th>\n",
       "      <th>曾助力</th>\n",
       "      <th>曾拼团</th>\n",
       "      <th>曾推荐</th>\n",
       "      <th>R值</th>\n",
       "      <th>M值</th>\n",
       "      <th>设备_手机</th>\n",
       "      <th>设备_电脑</th>\n",
       "      <th>城市类型_中小</th>\n",
       "      <th>城市类型_二线</th>\n",
       "      <th>裂变类型</th>\n",
       "      <th>P_TR</th>\n",
       "      <th>P_TN</th>\n",
       "      <th>P_CR</th>\n",
       "      <th>P_CN</th>\n",
       "      <th>增量分数</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>50706</th>\n",
       "      <td>50707</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>10</td>\n",
       "      <td>271.77</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>没有促销</td>\n",
       "      <td>0.104083</td>\n",
       "      <td>0.428877</td>\n",
       "      <td>0.020903</td>\n",
       "      <td>0.446138</td>\n",
       "      <td>0.100441</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37013</th>\n",
       "      <td>37014</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>250.76</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>没有促销</td>\n",
       "      <td>0.135687</td>\n",
       "      <td>0.473821</td>\n",
       "      <td>0.091070</td>\n",
       "      <td>0.299422</td>\n",
       "      <td>-0.129782</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14685</th>\n",
       "      <td>14686</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>341.75</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>没有促销</td>\n",
       "      <td>0.082415</td>\n",
       "      <td>0.399597</td>\n",
       "      <td>0.109551</td>\n",
       "      <td>0.408438</td>\n",
       "      <td>-0.018296</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19737</th>\n",
       "      <td>19738</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>10</td>\n",
       "      <td>29.99</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>没有促销</td>\n",
       "      <td>0.062630</td>\n",
       "      <td>0.467461</td>\n",
       "      <td>0.026409</td>\n",
       "      <td>0.443500</td>\n",
       "      <td>0.012260</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2581</th>\n",
       "      <td>2582</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>289.61</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>没有促销</td>\n",
       "      <td>0.074882</td>\n",
       "      <td>0.360975</td>\n",
       "      <td>0.080043</td>\n",
       "      <td>0.484100</td>\n",
       "      <td>0.117963</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>515</th>\n",
       "      <td>516</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>383.70</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>没有促销</td>\n",
       "      <td>0.137546</td>\n",
       "      <td>0.390306</td>\n",
       "      <td>0.062678</td>\n",
       "      <td>0.409471</td>\n",
       "      <td>0.094033</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34459</th>\n",
       "      <td>34460</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>284.90</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>没有促销</td>\n",
       "      <td>0.095995</td>\n",
       "      <td>0.416240</td>\n",
       "      <td>0.040642</td>\n",
       "      <td>0.447123</td>\n",
       "      <td>0.086237</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8017</th>\n",
       "      <td>8018</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>1093.03</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>没有促销</td>\n",
       "      <td>0.241067</td>\n",
       "      <td>0.378415</td>\n",
       "      <td>0.185070</td>\n",
       "      <td>0.195448</td>\n",
       "      <td>-0.126972</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28362</th>\n",
       "      <td>28363</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>9</td>\n",
       "      <td>31.08</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>情侣花享</td>\n",
       "      <td>0.046721</td>\n",
       "      <td>0.465126</td>\n",
       "      <td>0.027541</td>\n",
       "      <td>0.460611</td>\n",
       "      <td>0.014665</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18385</th>\n",
       "      <td>18386</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>10</td>\n",
       "      <td>222.89</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>没有促销</td>\n",
       "      <td>0.028604</td>\n",
       "      <td>0.450275</td>\n",
       "      <td>0.009232</td>\n",
       "      <td>0.511889</td>\n",
       "      <td>0.080986</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>8523 rows × 16 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "         用户码  曾助力  曾拼团  曾推荐  R值       M值  设备_手机  设备_电脑  城市类型_中小  城市类型_二线  \\\n",
       "50706  50707    0    1    0  10   271.77      1      0        0        0   \n",
       "37013  37014    0    1    0   3   250.76      0      1        0        0   \n",
       "14685  14686    0    1    0   2   341.75      1      0        0        1   \n",
       "19737  19738    0    1    1  10    29.99      0      1        0        1   \n",
       "2581    2582    0    1    1   2   289.61      0      0        0        1   \n",
       "...      ...  ...  ...  ...  ..      ...    ...    ...      ...      ...   \n",
       "515      516    0    1    0   1   383.70      0      1        0        1   \n",
       "34459  34460    0    1    1   2   284.90      0      1        0        1   \n",
       "8017    8018    1    1    1   2  1093.03      1      0        0        1   \n",
       "28362  28363    1    0    0   9    31.08      1      0        0        1   \n",
       "18385  18386    1    0    1  10   222.89      0      0        0        0   \n",
       "\n",
       "       裂变类型      P_TR      P_TN      P_CR      P_CN      增量分数  \n",
       "50706  没有促销  0.104083  0.428877  0.020903  0.446138  0.100441  \n",
       "37013  没有促销  0.135687  0.473821  0.091070  0.299422 -0.129782  \n",
       "14685  没有促销  0.082415  0.399597  0.109551  0.408438 -0.018296  \n",
       "19737  没有促销  0.062630  0.467461  0.026409  0.443500  0.012260  \n",
       "2581   没有促销  0.074882  0.360975  0.080043  0.484100  0.117963  \n",
       "...     ...       ...       ...       ...       ...       ...  \n",
       "515    没有促销  0.137546  0.390306  0.062678  0.409471  0.094033  \n",
       "34459  没有促销  0.095995  0.416240  0.040642  0.447123  0.086237  \n",
       "8017   没有促销  0.241067  0.378415  0.185070  0.195448 -0.126972  \n",
       "28362  情侣花享  0.046721  0.465126  0.027541  0.460611  0.014665  \n",
       "18385  没有促销  0.028604  0.450275  0.009232  0.511889  0.080986  \n",
       "\n",
       "[8523 rows x 16 columns]"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "discount_uplift = X_test.copy()  # build the uplift-score dataset\n",
    "discount_uplift['P_TR'] = uplift_probs[:,0]  # probability of treatment responder\n",
    "discount_uplift['P_TN'] = uplift_probs[:,1]  # probability of treatment non-responder\n",
    "discount_uplift['P_CR'] = uplift_probs[:,2]  # probability of control responder\n",
    "discount_uplift['P_CN'] = uplift_probs[:,3]  # probability of control non-responder\n",
    "# Compute the uplift score\n",
    "discount_uplift['增量分数'] = discount_uplift.eval('P_TR + P_CN - (P_TN + P_CR)')\n",
    "# Alternative, normalized within the treatment and control groups (kept disabled):\n",
    "# discount_uplift['增量分数'] = discount_uplift.eval('P_TR/(P_TR+P_TN) + P_CN/(P_CN+P_CR) - (P_TN/(P_TR+P_TN) + P_CR/(P_CN+P_CR))')\n",
    "discount_uplift"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
